SuccessMaker®

Program Description 1 

SuccessMaker® is a set of computer-based courses designed 
to supplement regular K-8 reading instruction. The program aims 
to improve skills in areas such as phonological awareness, phonics, 
fluency, vocabulary, comprehension, concepts of print, grammar, 
and spelling. The software adapts instruction to match students’ 
skill level and progress. “Foundations” courses contain basic skill- 
building exercises, while “Exploreware” courses focus on reading and 
writing activities aimed at building higher level analytical skills. The 
program analyzes students’ progress and assigns specific segments 
of the lesson, introducing new skills as they become appropriate. As 
the student progresses, an algorithm calculates the probability of the 
student answering the next exercise correctly, which determines 
the next steps of the lesson. 

Research 2 

The What Works Clearinghouse (WWC) identified one study of
SuccessMaker® that falls within the scope of the Adolescent Literacy
topic area and meets WWC group design standards without reservations.
This study included 1,094 adolescent readers in grades 5 and 7 in nine
schools located in seven states across the United States.
This intervention report presents
findings from a systematic review 
of SuccessMaker® conducted using 
the WWC Procedures and Standards 
Handbook, version 3.0, and the Adolescent 
Literacy review protocol, version 3.0. 


The WWC considers the extent of evidence for SuccessMaker® on the reading skills of adolescent readers to be small
for two student outcome domains: comprehension and reading fluency. There were no studies that met standards in the
two other domains, so this intervention report does not report on the effectiveness of SuccessMaker® for those domains.
(See the Effectiveness Summary on p. 4 for more details of effectiveness by domain.)


Effectiveness 

SuccessMaker® was found to have no discernible effects on comprehension and reading fluency for adolescent readers. 


Table 1. Summary of findings 3

| Outcome domain | Rating of effectiveness | Improvement index (percentile points): average | Improvement index (percentile points): range | Number of studies | Number of students | Extent of evidence |
|---|---|---|---|---|---|---|
| Comprehension | No discernible effects | +3 | +2 to +5 | 1 | 1,094 | Small |
| Reading fluency | No discernible effects | -3 | -4 to -1 | 1 | 1,087 | Small |


SuccessMaker® Updated November 2015













WWC Intervention Report 


Program Information 

Background 

SuccessMaker® was developed by Patrick Suppes at Stanford University and Mario Zanotti at the Computer 
Curriculum Corporation in the late 1960s (Suppes & Zanotti, 1996). 4 The program is now owned and distributed 
by Pearson Education. Earlier versions of the program were called SuccessMaker® Enterprise and Computer 
Curriculum Corporation (CCC) SuccessMaker®. The most current version of the program is called
SuccessMaker® 8. Address: One Lake Street, Upper Saddle River, NJ 07458. Email: communications@pearsoned.com.
Web: www.pearsoned.com. Telephone: (201) 236-7000. 

Program details 

The SuccessMaker® software, also referred to as an integrated learning system by authors of studies included 
in this review, is a supplemental program used in conjunction with existing language arts curricula. The program 
includes an instructional management system, formative assessments, a reporting system with information on 
student progress, and individualized reading curriculum resources for elementary and middle school instruction. 
Program activities (practice, tutoring, and games) are based on selections from classic literature for children and 
adolescents. Initial program courses, “Foundations,” contain basic skill-building exercises, while “Exploreware” 
courses focus on reading and writing activities aimed at building higher level analytical skills. The program offers 
approximately 43 hours of instruction per grade. Each student progresses through the computerized lessons at his 
or her own pace. The program individualizes instruction and provides real-time feedback and tutorials for students 
who encounter challenges during a lesson. If a student continually struggles with a new concept, SuccessMaker® 
sets the material aside to be reintroduced at a later point. SuccessMaker® also periodically checks the student’s 
recollection of previously mastered material. Professional development for using SuccessMaker® is available and 
focuses on instructional strategies to incorporate SuccessMaker® into the curricula and customized on-site support 
for teachers. 


Cost 

Cost information for SuccessMaker® is available from the distributor. 
Research Summary

The WWC identified nine eligible studies that investigated the effects of 
SuccessMaker® on the reading skills of adolescent readers. An additional 
57 studies were identified but do not meet WWC eligibility criteria for 
review in this topic area. Citations for all 66 studies are in the References 
section, which begins on p. 6. 

The WWC reviewed nine eligible studies against group design standards. 

One study (Gatti, 2011) is a randomized controlled trial that meets WWC group design standards without reservations. 
This study is summarized in this report. Eight studies do not meet WWC group design standards. 

Summary of study meeting WWC group design standards without reservations 

Gatti (2011) conducted a cluster randomized controlled trial that examined the effects of SuccessMaker® 
on fifth- and seventh-grade students 5 attending nine schools in Arizona, California, Indiana, Kansas, Michigan, 
Missouri, and Texas. Within each school, English language arts classes 6 were randomly assigned either to 
receive the SuccessMaker® program as a supplement to current instruction or to receive the schools’ regular 
English language arts program. The WWC based its effectiveness ratings on findings from the 641 fifth-grade
students who participated in the study (342 students in the SuccessMaker® group and 299 students in the
regular English language arts program) and the 453 seventh-grade students who participated in the study
(254 students in the SuccessMaker® group and 199 students in the regular English language arts program).

The study reported student outcomes after 1 year of program implementation. 

Summary of studies meeting WWC group design standards with reservations 

No studies of SuccessMaker® met WWC group design standards with reservations. 


Table 2. Scope of reviewed research

| Grades | Delivery method | Program type |
|---|---|---|
| 5, 7 | Individual | Supplement |
Effectiveness Summary

The WWC review of SuccessMaker® for the Adolescent Literacy topic area includes student outcomes in four domains: 
alphabetics, comprehension, general literacy achievement, and reading fluency. The one study of SuccessMaker® 
that meets WWC group design standards reported findings in two of the four domains: (a) comprehension and 
(b) reading fluency. The findings below present the author’s estimates and WWC-calculated estimates of the size 
and statistical significance of the effects of SuccessMaker® on adolescent readers. Additional comparisons are 
presented as supplemental findings in the appendix. The supplemental findings do not factor into the intervention’s 
rating of effectiveness. For a more detailed description of the rating of effectiveness and extent of evidence criteria, 
see the WWC Rating Criteria on p. 19. 

Summary of effectiveness for the comprehension domain 

One study that meets WWC group design standards without reservations reported findings in the comprehension domain. 

Gatti (2011) found that SuccessMaker® had statistically significant positive effects on the Overall Score, Passage
Comprehension, and Sentence Comprehension subtests of the Group Reading Assessment and Diagnostic Evaluation
(GRADE) for fifth- and seventh-grade students when compared to the regular English language arts program
alone. The WWC could not confirm the statistical significance of these findings after adjusting for the clustering of
students within classrooms. The average effect size across the two grades on the GRADE overall score was not large
enough to be considered substantively important, according to WWC criteria. The WWC characterizes these study
findings as an indeterminate effect.
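
The clustering adjustment described in this paragraph can be illustrated with a short sketch. It assumes the correction given in the WWC Procedures and Standards Handbook (version 3.0), in which the student-level t statistic is scaled down using an intraclass correlation (ICC), with 0.20 as the default for achievement outcomes. The ICC value, the function names, and the normal approximation used for the p-value are assumptions of this sketch and are not stated in this report. The inputs are the grade-level GRADE Overall Score figures from Appendix C.1 and the group sizes from the Research Summary.

```python
from math import sqrt
from statistics import NormalDist


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cluster_adjusted_p(mean_diff, sd1, n1, sd2, n2, clusters, icc=0.20):
    """Two-sided p-value after a clustering correction (normal approximation)."""
    total = n1 + n2
    avg_cluster = total / clusters            # average students per classroom
    d = mean_diff / pooled_sd(sd1, n1, sd2, n2)
    t = d * sqrt(n1 * n2 / total)             # t statistic ignoring clustering
    shrink = sqrt(
        ((total - 2) - 2 * (avg_cluster - 1) * icc)
        / ((total - 2) * (1 + (avg_cluster - 1) * icc))
    )
    t_adj = t * shrink
    # Assumption: with several hundred students the t distribution is close
    # to normal, so the standard normal CDF stands in for the t CDF here.
    return 2 * (1 - NormalDist().cdf(abs(t_adj)))


p_grade5 = cluster_adjusted_p(0.56, 12.93, 342, 12.75, 299, clusters=30)
p_grade7 = cluster_adjusted_p(1.86, 14.00, 254, 16.07, 199, clusters=20)
print(round(p_grade5, 2), round(p_grade7, 2))
```

Under these assumptions, both p-values land close to the WWC-computed values of .81 and .57 reported in Appendix C.1, far from conventional significance thresholds.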

Thus, for the comprehension domain, one study showed indeterminate effects. This results in a rating of no discernible 
effects, with a small extent of evidence. 


Table 3. Rating of effectiveness and extent of evidence for the comprehension domain

| Rating of effectiveness | Criteria met |
|---|---|
| No discernible effects: no affirmative evidence of effects. | In the one study that reported findings, the estimated impact of the intervention on outcomes in the comprehension domain was neither statistically significant nor large enough to be substantively important. |

| Extent of evidence | Criteria met |
|---|---|
| Small | One study that included 1,094 students in nine schools reported evidence of effectiveness in the comprehension domain. |


Summary of effectiveness for the reading fluency domain 

One study that meets WWC group design standards without reservations reported findings in the reading 
fluency domain. 

Gatti (2011) found that SuccessMaker® had a statistically significant negative effect on the AIMSweb Reading 
Curriculum-Based Measurement (AIMSweb R-CBM) number of words read correctly (WRC) for fifth-grade students 
when compared to the regular English language arts program alone. The WWC could not confirm the statistical 
significance of this finding after adjusting for the clustering of students within classrooms. The author did not find 
statistically significant effects of SuccessMaker® on the AIMSweb R-CBM WRC for seventh-grade students when 
compared to the regular English language arts program alone. The WWC confirmed the lack of statistical significance 
of this finding. The average effect size across the two grades was not large enough to be considered substantively 
important, according to WWC criteria. The WWC characterizes these study findings as an indeterminate effect. 
Thus, for the reading fluency domain, one study showed indeterminate effects. This results in a rating of no discernible
effects, with a small extent of evidence. 
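
The reasoning in this summary can be sketched as a small decision rule. The 0.25 standard-deviation threshold for a substantively important effect is taken from the WWC Procedures and Standards Handbook (version 3.0) rather than from this report, and the function name and simplified labels are illustrative only. The full WWC characterization covers cases this sketch ignores.

```python
SUBSTANTIVE_THRESHOLD = 0.25  # assumed from the WWC Handbook, in standard deviations


def characterize(effect_sizes, statistically_significant):
    """Label a study's domain finding from its average effect size."""
    average = sum(effect_sizes) / len(effect_sizes)  # simple average across findings
    direction = "positive" if average > 0 else "negative"
    if statistically_significant:
        label = f"statistically significant {direction}"
    elif abs(average) >= SUBSTANTIVE_THRESHOLD:
        label = f"substantively important {direction}"
    else:
        label = "indeterminate"
    return round(average, 2), label


# Grade 5 and grade 7 effect sizes for AIMSweb R-CBM, as reported in Appendix C.2.
print(characterize([-0.10, -0.02], statistically_significant=False))
```

With the two reading fluency effect sizes and no confirmed statistical significance, the sketch returns an average of -0.06 and the label "indeterminate", matching the characterization in the text.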


Table 4. Rating of effectiveness and extent of evidence for the reading fluency domain

| Rating of effectiveness | Criteria met |
|---|---|
| No discernible effects: no affirmative evidence of effects. | In the one study that reported findings, the estimated impact of the intervention on outcomes in the reading fluency domain was neither statistically significant nor large enough to be substantively important. |

| Extent of evidence | Criteria met |
|---|---|
| Small | One study that included 1,087 students in five schools reported evidence of effectiveness in the reading fluency domain. |
Appendix A: Research details for Gatti (2011)

Gatti, G. (2011). Pearson SuccessMaker reading efficacy study: 2010-11 final report. Pittsburgh, PA:
Gatti Evaluation Inc.

Table A. Summary of findings: Meets WWC group design standards without reservations

| Outcome domain | Sample size | Average improvement index (percentile points) | Statistically significant |
|---|---|---|---|
| Comprehension | 1,094 students | +3 | No |
| Reading fluency | 1,087 students | -3 | No |


Setting

The study was conducted in eight urban and suburban school districts located in seven
states: Arizona, California, Indiana, Kansas, Michigan, Missouri, and Texas.

Study sample

Nine schools participated in the study. The schools had to meet the following conditions:
they had to (1) have no prior exposure to SuccessMaker®; (2) have at least two teachers per
study grade level; (3) be geographically diverse; (4) agree that teachers would uphold random
assignment; and (5) agree that intervention group classrooms would have their students use
SuccessMaker® at least 1 hour per week.

English language arts classes (or sections) within the selected schools and grade levels were
randomly assigned to either the intervention or the business-as-usual comparison group. 7 The
fifth-grade sample included 16 classrooms implementing SuccessMaker® and 14 implementing
the school’s regular English language arts program. The seventh-grade sample included 11
classrooms implementing SuccessMaker® and nine implementing the school’s regular English
language arts program.

Of the 641 fifth-grade students who participated in the study, 342 received SuccessMaker®
and 299 received the school’s regular English language arts program. Of the 453 seventh-grade
students who participated in the study, 254 received SuccessMaker® and 199 received
the school’s regular English language arts program.

About 48% of the total sample were male, 39% were minority (about 23.6% Hispanic and
15.8% African American), and 100% received free or reduced-price lunch.

Intervention group

SuccessMaker® is an adaptive, computer-based learning program that includes an instructional
management system, formative assessments, a progress reporting system, and individualized
reading curriculum resources for elementary and middle school instruction. For this study, the
program was typically implemented with the entire class in a computer laboratory during the
regular reading instruction time. Intervention group students were expected to use the software
for at least 1 hour each week. Over the course of the school year, most teachers went to the lab
two or three times a week, with a median time of 22 hours for fifth-grade classes and 18 hours
for seventh-grade classes.
Comparison group

Comparison classes received the business-as-usual English language arts instruction, which
generally did not involve computer-based instruction. The majority of students in fifth grade
(62%) received instruction from four widely used published reading programs, while the
remainder received instruction from non-published teacher-developed curricula. In contrast,
the majority of students in seventh grade (63%) received instruction from non-published,
largely teacher-created curricula, while the remainder received instruction from three different
widely used published literacy programs.

Outcomes and measurement

Assessments were administered at the onset of the intervention and in the last month of the
school year. Outcomes in the comprehension domain included the Group Reading Assessment
and Diagnostic Evaluation (GRADE) Overall Score and three subtest scores of the GRADE:
Passage Comprehension, Sentence Comprehension, and Vocabulary. One measure in the reading
fluency domain was administered: the AIMSweb Reading Curriculum-Based Measurement.
Supplemental findings for the three GRADE subtest outcomes and for subgroups of students on
the GRADE overall score and the AIMSweb outcome are presented in Appendix D. 8 The
supplemental findings do not factor into the intervention’s rating of effectiveness. For a more detailed
description of these outcome measures, see Appendix B.

The study also included a researcher-designed student reading attitude survey. However, this
outcome is not eligible for review based on the Adolescent Literacy review protocol (version 3.0).

Support for implementation

Teachers received 1 day of initial training, some before school started and some in the second
month of the school year. The teachers also received a 3-hour follow-up training and additional
assistance from Pearson when needed.
Appendix B: Outcome measures for each domain


Comprehension 

Group Reading Assessment and Diagnostic Evaluation (GRADE) Overall Score

This test is an untimed, standardized, group-administered assessment of reading comprehension published by
Pearson Assessment. The overall score is based on scores on the Passage Comprehension, Sentence Comprehension,
and Vocabulary subtests. Two parallel forms were administered in the study for each grade level: Form
A at baseline and Form B at the end of the school year. The test has reliability coefficients of .93 and above for the
overall score (as cited in Gatti, 2011).

Reading comprehension

GRADE Passage Comprehension Subtest

This 19-item assessment measures students' understanding of extended text through explicit and implicit
multiple-choice questions requiring questioning, predicting, summarizing, and clarifying information from several
paragraphs. Reliability coefficients ranged from .77 to .85 for the three GRADE subtests (as cited in Gatti,
2011). This outcome is included as a supplemental finding.

GRADE Sentence Comprehension Subtest

This 30-item assessment measures students' ability to understand a given sentence as a complete thought by
asking test takers to identify the missing word in a sentence, the grammatical complexity of which varies by reading
level. Reliability coefficients ranged from .77 to .85 for the three GRADE subtests (as cited in Gatti, 2011). This
outcome is included as a supplemental finding.

Vocabulary development

GRADE Vocabulary Subtest

This assessment measures students' knowledge and understanding of words. Students are asked to select the
correct meaning of targeted words presented in short sentences or phrases. The subtest comprises 35
questions for grade 5 (level 5) and 40 questions for grade 7 (level M). Reliability coefficients ranged from .77 to
.85 for the three GRADE subtests (as cited in Gatti, 2011). This outcome is included as a supplemental finding.

Reading fluency

AIMSweb Reading Curriculum-Based Measurement

This assessment measures the number of words that a student reads accurately from a specified passage.
The student reads three passages, each for 1 minute, and the middle score is recorded. Bivariate correlation
coefficients between the three passages for all grades and testing sessions ranged from .89 to .93. The test is
published by Pearson Assessment (as cited in Gatti, 2011).
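
The scoring rule described for the AIMSweb measure (three one-minute passages, with the middle score recorded) amounts to taking the median. A minimal sketch follows; the function name is illustrative, and the words-read-correctly counts are made up and do not come from the study.

```python
from statistics import median


def rcbm_score(words_read_correctly):
    """Return the middle of three one-minute words-read-correctly (WRC) counts."""
    if len(words_read_correctly) != 3:
        raise ValueError("expected exactly three passage scores")
    return median(words_read_correctly)


# Hypothetical WRC counts for one student; not data from Gatti (2011).
print(rcbm_score([148, 161, 155]))  # prints 155
```

Recording the middle score rather than the mean keeps a single unusually easy or hard passage from shifting the student's result.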
Appendix C.1: Findings included in the rating for the comprehension domain

| Outcome measure | Study sample | Sample size | Intervention group mean (standard deviation) | Comparison group mean (standard deviation) | WWC calculations: mean difference | WWC calculations: effect size | WWC calculations: improvement index | WWC calculations: p-value |
|---|---|---|---|---|---|---|---|---|
| Gatti, 2011 a | | | | | | | | |
| Group Reading Assessment and Diagnostic Evaluation (GRADE) Overall Score | Grade 5 | 30 classrooms/641 students | 60.59 (12.93) | 60.03 (12.75) | 0.56 | 0.04 | +2 | <.00 |
| GRADE Overall Score | Grade 7 | 20 classrooms/453 students | 54.56 (14.00) | 52.70 (16.07) | 1.86 | 0.12 | +5 | <.00 |
| Domain average for comprehension (Gatti, 2011) | | | | | | 0.08 | +3 | Not statistically significant |
| Domain average for comprehension across all studies | | | | | | 0.08 | +3 | na |


Table Notes: For mean difference, effect size, and improvement index values reported in the table, a positive number favors the intervention group and a negative number favors 
the comparison group. The effect size is a standardized measure of the effect of an intervention on outcomes, representing the average change expected for all individuals who are 
given the intervention (measured in standard deviations of the outcome measure). The improvement index is an alternate presentation of the effect size, reflecting the change in 
an average individual’s percentile rank that can be expected if the individual is given the intervention. The WWC-computed average effect size is a simple average rounded to two 
decimal places; the average improvement index is calculated from the average effect size. The statistical significance of the study’s domain average was determined by the WWC. 
Some statistics may not sum as expected due to rounding. na = not applicable.

a For Gatti (2011), data reported in the table were obtained through author queries. The p-values presented here were reported in the original study. A correction for clustering was 
needed for all outcomes and resulted in WWC-computed p-values of .81 and .57 for grades 5 and 7, respectively; therefore, the WWC does not find any of the results to be statistically 
significant. The WWC calculated the program group means using a difference-in-differences approach (see the WWC Procedures and Standards Handbook) by adding the impact of 
the program (i.e., difference in mean gains between the intervention and comparison groups) to the unadjusted comparison group posttest means. Please see the WWC Procedures 
and Standards Handbook (version 3.0) for more information. This study is characterized as having no discernible effects because the estimated impacts of the intervention on
outcomes in the comprehension domain for fifth- and seventh-grade students were neither statistically significant nor large enough to be substantively important. For more information,
please refer to the WWC Procedures and Standards Handbook (version 3.0), p. 26. 
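
The calculations described in this note can be reproduced approximately from the table values. The sketch below assumes the effect size is a standardized mean difference using the pooled standard deviation with a small-sample correction (Hedges' g), as described in the WWC Procedures and Standards Handbook. The function names are illustrative, and the group sizes are those reported for the GRADE sample in the Research Summary.

```python
from math import sqrt


def adjusted_intervention_mean(comparison_posttest_mean, difference_in_gains):
    """Difference-in-differences: comparison posttest mean plus the program impact."""
    return comparison_posttest_mean + difference_in_gains


def hedges_g(mean_diff, sd_i, n_i, sd_c, n_c):
    """Standardized mean difference with the small-sample correction."""
    pooled = sqrt(((n_i - 1) * sd_i**2 + (n_c - 1) * sd_c**2) / (n_i + n_c - 2))
    correction = 1 - 3 / (4 * (n_i + n_c) - 9)
    return correction * mean_diff / pooled


# Grade 5: comparison posttest mean 60.03, impact (mean difference) 0.56.
grade5_mean = adjusted_intervention_mean(60.03, 0.56)
g5 = hedges_g(0.56, 12.93, 342, 12.75, 299)
g7 = hedges_g(1.86, 14.00, 254, 16.07, 199)
print(round(grade5_mean, 2))
print(round(g5, 2), round(g7, 2), round((g5 + g7) / 2, 2))
```

Rounded to two decimal places, the results agree with the table: an intervention group mean of 60.59, effect sizes of 0.04 and 0.12, and a domain average of 0.08.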
Appendix C.2: Findings included in the rating for the reading fluency domain

| Outcome measure | Study sample | Sample size | Intervention group mean (standard deviation) | Comparison group mean (standard deviation) | WWC calculations: mean difference | WWC calculations: effect size | WWC calculations: improvement index | WWC calculations: p-value |
|---|---|---|---|---|---|---|---|---|
| Gatti, 2011 a | | | | | | | | |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5 | 30 classrooms/639 students | 152.20 (38.01) | 156.01 (37.63) | -3.81 | -0.10 | -4 | <.00 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7 | 20 classrooms/448 students | 164.47 (34.24) | 165.37 (38.03) | -0.90 | -0.02 | -1 | >.05 |
| Domain average for reading fluency (Gatti, 2011) | | | | | | -0.06 | -3 | Not statistically significant |
| Domain average for reading fluency across all studies | | | | | | -0.06 | -3 | na |


Table Notes: For mean difference, effect size, and improvement index values reported in the table, a positive number favors the intervention group and a negative number favors 
the comparison group. The effect size is a standardized measure of the effect of an intervention on outcomes, representing the average change expected for all individuals who are 
given the intervention (measured in standard deviations of the outcome measure). The improvement index is an alternate presentation of the effect size, reflecting the change in 
an average individual's percentile rank that can be expected if the individual is given the intervention. The WWC-computed average effect size is a simple average rounded to two 
decimal places; the average improvement index is calculated from the average effect size. The statistical significance of the study’s domain average was determined by the WWC. 
Some statistics may not sum as expected due to rounding. na = not applicable.

a For Gatti (2011), data reported in the table were obtained through author queries. The p-values presented here were reported in the original study. A correction for clustering
was needed for all outcomes and resulted in WWC-computed p-values of .91 and .58 for grades 5 and 7, respectively; therefore, the WWC does not find any of the results to be
statistically significant. The WWC calculated the program group mean using a difference-in-differences approach (see the WWC Procedures and Standards Handbook) by adding 
the impact of the program (i.e., difference in mean gains between the intervention and comparison groups) to the unadjusted comparison group posttest means. Please see the 
WWC Procedures and Standards Handbook (version 3.0) for more information. This study is characterized as having no discernible effects because the estimated impacts of the 
intervention on outcomes in the reading fluency domain for fifth- and seventh-grade students were neither statistically significant nor large enough to be substantively important. 
For more information, please refer to the WWC Procedures and Standards Handbook (version 3.0), p. 26. 
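
The improvement index described in the table notes can be computed from an effect size with the standard normal distribution: it is the percentile rank that an average comparison-group student would be expected to reach with the intervention, minus 50. A minimal sketch, assuming that definition from the WWC Procedures and Standards Handbook (the function name is illustrative):

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Expected change in percentile rank for an average comparison-group student."""
    return (NormalDist().cdf(effect_size) - 0.5) * 100


for g in (-0.10, -0.02):  # grade 5 and grade 7 effect sizes from the table
    print(g, round(improvement_index(g)))
```

For the grade-level effect sizes this gives -4 and -1, as in the table. Because the table shows effect sizes rounded to two decimal places, recomputing an index from a rounded value will not always match the reported figure exactly.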
Appendix D.1: Description of supplemental findings for the comprehension domain

| Outcome measure | Study sample | Sample size | Intervention group mean (standard deviation) | Comparison group mean (standard deviation) | WWC calculations: mean difference | WWC calculations: effect size | WWC calculations: improvement index | WWC calculations: p-value |
|---|---|---|---|---|---|---|---|---|
| Gatti, 2011 a | | | | | | | | |
| Group Reading Assessment and Diagnostic Evaluation (GRADE) Passage Comprehension Subtest | Grade 5 | 30 classrooms/641 students | 23.40 (4.89) | 23.04 (4.97) | 0.36 | 0.07 | +3 | <.00 |
| GRADE Sentence Comprehension Subtest | Grade 5 | 30 classrooms/641 students | 15.81 (3.48) | 15.48 (3.35) | 0.33 | 0.10 | +4 | <.00 |
| GRADE Vocabulary Subtest | Grade 5 | 30 classrooms/641 students | 21.39 (6.01) | 21.51 (6.09) | -0.12 | -0.02 | -1 | .34 |
| GRADE Passage Comprehension Subtest | Grade 7 | 20 classrooms/453 students | 20.17 (5.38) | 19.78 (6.02) | 0.39 | 0.07 | +3 | <.02 |
| GRADE Sentence Comprehension Subtest | Grade 7 | 20 classrooms/453 students | 14.01 (3.85) | 13.29 (4.32) | 0.72 | 0.18 | +7 | <.00 |
| GRADE Vocabulary Subtest | Grade 7 | 20 classrooms/453 students | 20.37 (6.16) | 19.63 (7.09) | 0.74 | 0.11 | +4 | <.00 |
| Grade 5 subgroups | | | | | | | | |
| GRADE Overall Score | Grade 5: Low achieving | 30 classrooms/130 students | 44.36 (10.11) | 45.34 (11.54) | -0.98 | -0.09 | -4 | >.05 |
| GRADE Overall Score | Grade 5: African American | 30 classrooms/44 students | 51.15 (13.94) | 50.65 (13.51) | 0.50 | 0.04 | +1 | >.05 |
| GRADE Overall Score | Grade 5: Hispanic | 30 classrooms/207 students | 57.43 (10.82) | 55.48 (11.23) | 1.75 | 0.16 | +6 | <.05 |
| GRADE Overall Score | Grade 5: Male | 30 classrooms/298 students | 61.28 (12.55) | 60.31 (12.64) | 0.97 | 0.08 | +3 | >.05 |
| GRADE Overall Score | Grade 5: Female | 30 classrooms/343 students | 60.01 (13.30) | 59.80 (12.87) | 0.21 | 0.02 | +1 | >.05 |
| GRADE Overall Score | Grade 5: Reduced-price lunch | 30 classrooms/286 students | 57.26 (11.32) | 55.32 (12.53) | 1.94 | 0.16 | +6 | <.05 |
| Grade 7 subgroups | | | | | | | | |
| GRADE Overall Score | Grade 7: Low achieving | 20 classrooms/144 students | 39.89 (9.96) | 37.33 (9.95) | 2.56 | 0.26 | +10 | <.05 |
| GRADE Overall Score | Grade 7: African American | 20 classrooms/129 students | 46.61 (13.64) | 45.13 (14.45) | 1.48 | 0.10 | +4 | <.05 |
| GRADE Overall Score | Grade 7: Male | 20 classrooms/222 students | 60.22 (14.70) | 57.70 (17.43) | 2.52 | 0.16 | +6 | <.05 |
| GRADE Overall Score | Grade 7: Female | 20 classrooms/231 students | 54.96 (13.22) | 53.66 (14.68) | 1.30 | 0.09 | +4 | <.05 |
| GRADE Overall Score | Grade 7: Reduced-price lunch | 20 classrooms/239 students | 47.50 (13.76) | 46.08 (14.73) | 1.42 | 0.10 | +4 | <.05 |






Table Notes: The supplemental findings presented in this table are additional findings from studies in this report that meet WWC design standards without reservations, but do not 
factor into the determination of the intervention rating. For mean difference, effect size, and improvement index values reported in the table, a positive number favors the
intervention group and a negative number favors the comparison group. The effect size is a standardized measure of the effect of an intervention on outcomes, representing the average
change expected for all individuals who are given the intervention (measured in standard deviations of the outcome measure). The improvement index is an alternate presentation 
of the effect size, reflecting the change in an average individual's percentile rank that can be expected if the individual is given the intervention. Some statistics may not sum as 
expected due to rounding. 

a For Gatti (2011), data reported in the table were obtained through author queries. The p-values presented here were reported in the original study. A correction for clustering was 
needed for all outcomes and resulted in WWC-computed p-values ranging from .31 to .92; therefore, the WWC does not find any of the results to be statistically significant. The
WWC calculated the program group mean using a difference-in-differences approach (see the WWC Procedures and Standards Handbook) by adding the impact of the program (i.e., 
difference in mean gains between the intervention and comparison groups) to the unadjusted comparison group posttest means. Please see the WWC Procedures and Standards 
Handbook (version 3.0) for more information. 

Appendix D.2: Description of supplemental findings for the reading fluency domain 


| Outcome measure | Study sample | Sample size | Intervention group mean (standard deviation) | Comparison group mean (standard deviation) | WWC calculations: mean difference | WWC calculations: effect size | WWC calculations: improvement index | WWC calculations: p-value |
|---|---|---|---|---|---|---|---|---|
| Gatti, 2011 a | | | | | | | | |
| Grade 5 subgroups | | | | | | | | |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5: Low achieving | 30 classrooms/129 students | 116.63 (29.70) | 118.85 (31.70) | -2.22 | -0.07 | -3 | >.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5: African American | 30 classrooms/44 students | 143.74 (34.71) | 133.52 (43.23) | 10.22 | 0.25 | +10 | <.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5: Hispanic | 30 classrooms/205 students | 139.32 (32.59) | 145.04 (31.32) | -5.72 | -0.18 | -7 | <.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5: Male | 30 classrooms/297 students | 153.23 (37.50) | 156.95 (38.01) | -3.72 | -0.10 | -4 | <.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5: Female | 30 classrooms/342 students | 151.71 (38.33) | 155.23 (37.42) | -3.52 | -0.09 | -4 | <.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 5: Reduced-price lunch | 30 classrooms/284 students | 138.40 (33.52) | 141.69 (33.98) | -3.29 | -0.10 | -4 | <.05 |
| Grade 7 subgroups | | | | | | | | |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7: Low achieving | 20 classrooms/141 students | 134.57 (28.76) | 139.23 (27.80) | -4.66 | -0.16 | -7 | >.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7: African American | 20 classrooms/128 students | 150.93 (33.92) | 153.28 (32.11) | -2.35 | -0.07 | -3 | <.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7: Hispanic | 20 classrooms/51 students | 146.74 (34.99) | 149.71 (23.31) | -2.97 | -0.10 | -4 | <.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7: Male | 20 classrooms/219 students | 157.82 (32.99) | 159.42 (36.54) | -1.60 | -0.05 | -2 | >.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7: Female | 20 classrooms/229 students | 170.84 (35.53) | 170.97 (38.73) | -0.13 | 0 | 0 | >.05 |
| AIMSweb Reading Curriculum-Based Measurement | Grade 7: Reduced-price lunch | 20 classrooms/23 students | 151.06 (32.13) | 152.51 (30.55) | -1.45 | -0.05 | -2 | >.05 |

Table Notes: The supplemental findings presented in this table are additional findings from studies in this report that meet WWC design standards without reservations but do not 
factor into the determination of the intervention rating. For mean difference, effect size, and improvement index values reported in the table, a positive number favors the intervention
group and a negative number favors the comparison group. The effect size is a standardized measure of the effect of an intervention on outcomes, representing the average
change expected for all individuals who are given the intervention (measured in standard deviations of the outcome measure). The improvement index is an alternate presentation 
of the effect size, reflecting the change in an average individual's percentile rank that can be expected if the individual is given the intervention. Some statistics may not sum as 
expected due to rounding. 
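
The link between the effect size and the improvement index described in these notes can be illustrated with a short sketch. It assumes the improvement index is the percentile rank that corresponds to the effect size under a standard normal distribution, minus 50; the function name `improvement_index` is illustrative and not part of any WWC tool, and the Handbook remains the authoritative source for the calculation.

```python
from statistics import NormalDist


def improvement_index(effect_size: float) -> float:
    """Expected change in percentile rank for an average comparison-group
    student, assuming outcomes are normally distributed (an assumption of
    this sketch)."""
    # Percentile rank of a student who starts at the 50th percentile and
    # moves by `effect_size` standard deviations.
    percentile = 100.0 * NormalDist().cdf(effect_size)
    return percentile - 50.0


# An effect size of -0.18 corresponds to roughly -7 percentile points.
print(round(improvement_index(-0.18)))
```

Because the table reports rounded effect sizes, an index recomputed from a rounded value can differ by a point from the published index.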

a For Gatti (2011), data reported in the table were obtained through author queries. The p-values presented here were reported in the original study. A correction for clustering was 
needed for all outcomes and resulted in WWC-computed p-values ranging from .46 to .99; therefore, the WWC does not find any of the results to be statistically significant. The 
WWC calculated the program group mean using a difference-in-differences approach (see the WWC Procedures and Standards Handbook) by adding the impact of the program (i.e., 
difference in mean gains between the intervention and comparison groups) to the unadjusted comparison group posttest means. Please see the WWC Procedures and Standards 
Handbook (version 3.0) for more information. 
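
The difference-in-differences adjustment described in note a can be sketched as follows. The pretest and posttest values are hypothetical and chosen only to show the arithmetic; the function name is likewise illustrative and does not come from the study or the Handbook.

```python
def adjusted_intervention_mean(int_pre, int_post, cmp_pre, cmp_post):
    """Return (adjusted intervention mean, program impact).

    The impact is the difference in mean gains between the intervention
    and comparison groups. It is added to the unadjusted comparison group
    posttest mean to obtain the adjusted intervention group mean.
    """
    impact = (int_post - int_pre) - (cmp_post - cmp_pre)
    return cmp_post + impact, impact


# Hypothetical means: the intervention group gains 30 points and the
# comparison group gains 36, so the impact is -6.
adjusted_mean, impact = adjusted_intervention_mean(100.0, 130.0, 104.0, 140.0)
print(adjusted_mean, impact)
```

Under this construction, the mean difference between the adjusted intervention mean and the comparison posttest mean equals the impact.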
Endnotes 

1 The descriptive information for this program was obtained from a publicly available source: the program’s website
(http://www.pearsonschool.com, downloaded November 2014). The WWC requests distributors review the program description sections for
accuracy from their perspective. The program description was provided to the distributor in November 2014; however, the WWC 
received no response. Further verification of the accuracy of the descriptive information for this program is beyond the scope of 
this review. 

2 The literature search reflects documents publicly available by October 2014, and represents an updated systematic review and 
assessment of the available studies. The previous intervention report was released in June 2009 and included studies reviewed under 
the WWC evidence standards (version 1.0). This report has been updated to include reviews of 34 studies that have been released
since 2009. Of the additional studies, 30 were not within the scope of the review protocol for the Adolescent Literacy topic area,
and three were within the scope of the review protocol but did not meet WWC group design standards. One study met WWC group
design standards without reservations, and findings from this study are highlighted in this report. In addition, three studies (Beattie, 
2000; Campbell, 2000; and Gallagher, 1996), which met WWC design standards with reservations in the previous report, do not meet 
WWC design standards in this report. These revised dispositions are due to changes in the design standards and the Adolescent 
Literacy review protocol. In particular, for Campbell (2000), a statistical adjustment for baseline differences between 0.05 and 0.25 standard deviations is
required in order to meet the baseline equivalence requirement. For Gallagher (1996), baseline differences either required a statistical
adjustment for outcome analyses in grades 4, 5, and the full sample, or exceeded 0.25 standard deviation for outcome analyses in 
grades 6 and 7. For Beattie (2000), the randomized controlled trial analysis included outcomes that had a combination of overall and 
differential attrition rates that exceeded the WWC standards (version 3.0), and the subsequent analytic intervention and comparison 
groups were not shown to be equivalent. A complete list and disposition of all studies reviewed are provided in the references. The 
studies in this report were reviewed using the Standards from the WWC Procedures and Standards Handbook (version 3.0), along with 
those described in the Adolescent Literacy review protocol (version 3.0). The evidence presented in this report is based on available 
research. Findings and conclusions may change as new research becomes available. 

3 For criteria used in the determination of the rating of effectiveness and extent of evidence, see the WWC Rating Criteria on p. 19. These
improvement index numbers show the average and range of individual-level improvement indices for all findings across the studies. 

4 Suppes, P., & Zanotti, M. (1996). Mastery learning of elementary mathematics: Theory and data. In P. Suppes & M. Zanotti (Eds.),
Foundations of probability with applications (pp. 149-188). New York: Cambridge University Press. 

5 The study also included third-grade students, who are out of scope of the Adolescent Literacy review protocol (version 3.0).

6 Intact classrooms of students (clusters), as opposed to individual students, were randomly assigned to intervention conditions. 

7 In some schools, fifth- or seventh-grade teachers taught multiple sections of English language arts, and each section was randomly 
assigned, as there are references in the study to some teachers teaching both SuccessMaker® and comparison sections.

8 Supplemental findings are presented for two domains in Appendix D. For the comprehension domain, findings are presented for 
the three GRADE subtests (Passage Reading Comprehension, Sentence Comprehension, and Vocabulary). For both domains,
comprehension and reading fluency, findings are also presented for the following subgroups: low-achieving students, African-American
students, Hispanic students, male students, female students, and students who qualified for reduced-price lunch. Note that analyses 
for seventh-grade Hispanic students met WWC group design standards with reservations for the AIMSweb Words Read Correctly 
outcome because there was high student attrition, but the intervention and comparison groups were equivalent at baseline. Analyses 
in the comprehension domain for seventh-grade Hispanic students did not meet WWC group design standards because, due to high 
student attrition, equivalence of the analytic intervention and comparison groups was required but not demonstrated.

Recommended Citation 

U.S. Department of Education, Institute of Education Sciences, What Works Clearinghouse. (2015, November). 
Adolescent Literacy intervention report: SuccessMaker®. Retrieved from http://whatworks.ed.gov 
WWC Rating Criteria 


Criteria used to determine the rating of a study 

Study rating 

Criteria 

Meets WWC group design 
standards without reservations 

A study that provides strong evidence for an intervention’s effectiveness, such as a well-implemented RCT. 

Meets WWC group design 
standards with reservations 

A study that provides weaker evidence for an intervention’s effectiveness, such as a QED or an RCT with high 
attrition that has established equivalence of the analytic samples. 

Criteria used to determine the rating of effectiveness for an intervention 

Rating of effectiveness 

Criteria 

Positive effects 

Two or more studies show statistically significant positive effects, at least one of which met WWC group design 
standards for a strong design, AND 

No studies show statistically significant or substantively important negative effects. 

Potentially positive effects 

At least one study shows a statistically significant or substantively important positive effect, AND 

No studies show a statistically significant or substantively important negative effect AND fewer or the same number
of studies show indeterminate effects than show statistically significant or substantively important positive effects.

Mixed effects 

At least one study shows a statistically significant or substantively important positive effect AND at least one study
shows a statistically significant or substantively important negative effect, but no more such studies than the number 
showing a statistically significant or substantively important positive effect, OR 

At least one study shows a statistically significant or substantively important effect AND more studies show an 
indeterminate effect than show a statistically significant or substantively important effect. 

Potentially negative effects 

One study shows a statistically significant or substantively important negative effect and no studies show a statistically 
significant or substantively important positive effect, OR 

Two or more studies show statistically significant or substantively important negative effects, at least one study shows 
a statistically significant or substantively important positive effect, and more studies show statistically significant or 
substantively important negative effects than show statistically significant or substantively important positive effects. 

Negative effects 

Two or more studies show statistically significant negative effects, at least one of which met WWC group design 
standards for a strong design, AND 

No studies show statistically significant or substantively important positive effects.

No discernible effects 

None of the studies shows a statistically significant or substantively important effect, either positive or negative. 

Criteria used to determine the extent of evidence for an intervention 

Extent of evidence 

Criteria 

Medium to large 

The domain includes more than one study, AND 
The domain includes more than one school, AND 

The domain findings are based on a total sample size of at least 350 students, OR, assuming 25 students in a class, 
a total of at least 14 classrooms across studies. 

Small 

The domain includes only one study, OR 
The domain includes only one school, OR 

The domain findings are based on a total sample size of fewer than 350 students, AND, assuming 25 students 
in a class, a total of fewer than 14 classrooms across studies. 
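
The extent-of-evidence criteria above can be expressed as a small decision function. This is a sketch of the rule as stated in this table, not WWC software; the function and parameter names are illustrative.

```python
def extent_of_evidence(n_studies, n_schools, n_students, n_classrooms):
    """Classify the extent of evidence for one outcome domain."""
    # Sample-size condition: at least 350 students OR at least 14 classrooms.
    enough_sample = n_students >= 350 or n_classrooms >= 14
    if n_studies > 1 and n_schools > 1 and enough_sample:
        return "Medium to large"
    # Otherwise at least one "Small" condition holds: a single study,
    # a single school, or too small a sample.
    return "Small"


# One study in nine schools with 1,094 students, as in this report. The
# classroom count of 50 is an assumption taken from the 30 and 20
# classrooms listed in the supplemental table rows.
print(extent_of_evidence(n_studies=1, n_schools=9, n_students=1094, n_classrooms=50))
```

With a single qualifying study, the result is "Small" regardless of sample size, which matches the extent of evidence reported for SuccessMaker®.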
Glossary of Terms 

Attrition
Attrition occurs when an outcome variable is not available for all participants initially assigned
to the intervention and comparison groups. The WWC considers the total attrition rate and
the difference in attrition rates across groups within a study.

Clustering adjustment
If intervention assignment is made at a cluster level and the analysis is conducted at the student
level, the WWC will adjust the statistical significance to account for this mismatch, if necessary.

Confounding factor
A confounding factor is a component of a study that is completely aligned with one of the
study conditions, making it impossible to separate how much of the observed effect was
due to the intervention and how much was due to the factor.

Design
The design of a study is the method by which intervention and comparison groups were assigned.

Domain
A domain is a group of closely related outcomes.

Effect size
The effect size is a measure of the magnitude of an effect. The WWC uses a standardized
measure to facilitate comparisons across studies and outcomes.

Eligibility
A study is eligible for review and inclusion in this report if it falls within the scope of the
review protocol and uses either an experimental or matched comparison group design.

Equivalence
A demonstration that the analysis sample groups are similar on observed characteristics
defined in the review area protocol.

Extent of evidence
An indication of how much evidence supports the findings. The criteria for the extent
of evidence levels are given in the WWC Rating Criteria on p. 19.

Improvement index
Along a percentile distribution of individuals, the improvement index represents the gain
or loss of the average individual due to the intervention. As the average individual starts at
the 50th percentile, the measure ranges from -50 to +50.

Intervention
An educational program, product, practice, or policy aimed at improving student outcomes.

Intervention report
A summary of the findings of the highest-quality research on a given program, product,
practice, or policy in education. The WWC searches for all research studies on an intervention,
reviews each against design standards, and summarizes the findings of those that
meet WWC design standards.

Multiple comparison adjustment
When a study includes multiple outcomes or comparison groups, the WWC will adjust
the statistical significance to account for the multiple comparisons, if necessary.

Quasi-experimental design (QED)
A quasi-experimental design (QED) is a research design in which study participants are
assigned to intervention and comparison groups through a process that is not random.

Randomized controlled trial (RCT)
A randomized controlled trial (RCT) is an experiment in which eligible study participants are
randomly assigned to intervention and comparison groups.

Rating of effectiveness
The WWC rates the effects of an intervention in each domain based on the quality of the
research design and the magnitude, statistical significance, and consistency in findings. The
criteria for the ratings of effectiveness are given in the WWC Rating Criteria on p. 19.

Single-case design
A research approach in which an outcome variable is measured repeatedly within and
across different conditions that are defined by the presence or absence of an intervention.

Standard deviation
The standard deviation of a measure shows how much variation exists across observations
in the sample. A low standard deviation indicates that the observations in the sample tend
to be very close to the mean; a high standard deviation indicates that the observations in
the sample tend to be spread out over a large range of values.

Statistical significance
Statistical significance is the probability that the difference between groups is a result of
chance rather than a real difference between the groups. The WWC labels a finding statistically
significant if the likelihood that the difference is due to chance is less than 5% (p < .05).

Substantively important
A substantively important finding is one that has an effect size of 0.25 or greater, regardless
of statistical significance.

Systematic review
A review of existing literature on a topic that is identified and reviewed using explicit
methods. A WWC systematic review has five steps: 1) developing a review protocol; 2)
searching the literature; 3) reviewing studies, including screening studies for eligibility,
reviewing the methodological quality of each study, and reporting on high quality studies
and their findings; 4) combining findings within and across studies; and 5) summarizing
the review.


Please see the WWC Procedures and Standards Handbook (version 3.0) for additional details. 
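
The glossary thresholds for statistical significance (p < .05) and substantive importance (effect size of 0.25 or greater) can be combined into a simple classifier for a single finding. This is a sketch only: the function name is illustrative, and applying the 0.25 threshold to the absolute value of negative effect sizes is an assumption inferred from the rating criteria, which refer to substantively important negative effects.

```python
def classify_finding(effect_size, p_value, alpha=0.05, threshold=0.25):
    """Label one finding using the glossary thresholds."""
    if effect_size == 0:
        return "indeterminate"
    direction = "positive" if effect_size > 0 else "negative"
    if p_value < alpha:
        return f"statistically significant {direction}"
    # Assumption: the 0.25 threshold applies to the magnitude of the effect.
    if abs(effect_size) >= threshold:
        return f"substantively important {direction}"
    return "indeterminate"


# The largest subgroup effect in the supplemental table (-0.18), paired
# with the smallest WWC-computed p-value reported in note a (.46).
print(classify_finding(-0.18, 0.46))
```

Under these thresholds, a finding of that size and p-value is neither statistically significant nor substantively important.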
An intervention report summarizes the findings of high-quality research on a given program, practice, or policy in 
education. The WWC searches for all research studies on an intervention, reviews each against evidence standards, 
and summarizes the findings of those that meet standards. 


This intervention report was prepared for the WWC by Mathematica Policy Research under contract ED-IES-13-C-0010. 